Skip to content

Conversation

@rekmarks
Copy link
Member

@rekmarks rekmarks commented Apr 18, 2025

Closes: #369

Replaces all kernel and vat "message" types with the new RpcClient / RpcService pattern. For related work and the justification for this, see:

Note that, in the course of implementing this refactor, it was discovered that we are never returning any responses to syscalls originating from vat supervisors. This PR maintains a provisional solution using the new pattern, and #488 tracks the work to actually fix this.

@rekmarks rekmarks mentioned this pull request Apr 18, 2025
5 tasks
@rekmarks rekmarks changed the title refactor: Replace IPC conventions with RpcClient and RpcService refactor: Replace "message" IPC pattern with RpcClient and RpcService Apr 18, 2025
@rekmarks rekmarks mentioned this pull request Apr 18, 2025
@rekmarks rekmarks marked this pull request as ready for review April 18, 2025 22:30
@rekmarks rekmarks requested a review from a team as a code owner April 18, 2025 22:30
*/
export function makeVatKVStore(state: Map<string, string>): VatKVStore {
let sets: Map<string, string> = new Map();
let sets: Record<string, string> = {};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This really wants to be a Map. Using an object as a map takes the JS engine down the slow path; people doing this a lot was one of the motivations for the addition of the Map class in the first place.

Copy link
Member Author

@rekmarks rekmarks Apr 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we think keeping it as a map will be more performant even if we have to convert it into an object in checkpoint()?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think possibly we may be hitting one of the (many) downsides to using JSON-RPC (or at least the JSON-RPC libraries) as the interprocess message transport (unless those libraries use a smarter serialization implementation than JSON.stringify()). Either you need to convert a map to an object in order to serialize it (which is expensive) or use an object as a map (which is expensive).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On balance, keeping it as a Map is probably better because it's only serialized at the end of the crank, whereas it can be modified a lot during the crank. Also (as you'll see from some of my other comments below) it seems to get used in a bunch of contexts where that serialization is not a thing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VatSupervisor.#initVat() previously return ROOT_OBJECT_VREF, but the
corresponding "initVat" RPC method expected a vat checkpoint. Prior to
this PR, we never actually did anything with the #initVat() return
value, so it now just returns the vat checkpoint directly.
grypez
grypez previously requested changes Apr 21, 2025
Copy link
Contributor

@grypez grypez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nits.

@rekmarks rekmarks requested a review from grypez April 21, 2025 18:58
@rekmarks
Copy link
Member Author

@FUDCo re: VatStore / VatCheckpoint: 4ea1172

@rekmarks rekmarks requested a review from FUDCo April 21, 2025 21:27

// XXX TODO: Actually handle the syscall results
this.#syscallsInFlight.length = 0;
this.#rpcClient.rejectAll(new Error('end of crank'));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm definitely not following the use of rejection here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it doesn't really make sense on its own merits. However, what we're doing here is a stopgap measure. Outside of this specific and temporary use case, I can't think of a reason to give the RpcClient a resolveAll() method. So, I thought it best to use rejectAll() until we actually start handling the syscalls. Nobody is awaiting these promises at present anyway.

If we want to do anything different here right now, we could just get rid of #syscallsInFlight entirely, but IMO it's a pretty harmless breadcrumb as things stand.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

syscallsInFlight was collecting the promises being generated by each syscall to the kernel because the transport spat one out for each write, which were then fed to a Promise.all() that basically expected them to all succeed (which, given how this works, they all did). But if an error in a write did occur, something would at least complain. Here we seem to be doing the same accumulating, just for #rpcClient.call() calls instead of #commandStream.write() calls. But as far as I can tell it's not actually doing anything with the array except discarding it at the end of the crank. Are these promises also being accumulated somehow inside #rpcClient? And what does the call to #rpcClient.rejectAll actually reject? And where does the Error that we construct here go?

Copy link
Member Author

@rekmarks rekmarks Apr 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To recap for those following along at home: the context here is that there was never any response handling on the other end, and we are going conclusively fix this in #488 / #489.

Now:

But if an error in a write did occur, something would at least complain.

Yes, currently on main, the #syscallsInFlight promises are actually for the completion of the transport's write operation as opposed to the receipt of a response. It is almost impossible for us to mess up so badly as to cause writes to fail, so those promises in all cases resolve with undefined.

In this implementation, we instead collect the promises from #rpcClient.call(), which resolves with the result of the call. Unless something goes wrong in the handling of the syscall, this is (provisionally) always ['ok', null]. (For where this happens, see the initialization of this.#rpcService in the VatHandle constructor.) One of the problems with this until we conclusively fix things in #488 / #489 is that that the vat consistently receives deliver messages before all of the syscall responses arrive. Because we clear #syscallsInFlight in #deliver(), the syscall responses become orphaned, and the #rpcClient complains about this via #logger.debug(). To decrease the log noise, I decided to just reject all of the #syscallsInFlight promises and stick a no-op catch handler on them.

This is obviously extremely contrived, and if we can deal with some occasional log noise (mostly in kernel-test, because Node.js doesn't have an inherent notion of log levels), we can make the vat aware if any of the syscalls actually fail on the kernel side (even if we don't really handle that failure right now).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, an egregious hack, which seems entirely appropriate for the situation at hand.

@grypez grypez removed their request for review April 21, 2025 23:40
@rekmarks rekmarks force-pushed the rekm/rpc-vat-kernel branch from 6eefcdb to b5c3a7d Compare April 21, 2025 23:41
@grypez grypez dismissed their stale review April 21, 2025 23:42

Addressed

Copy link
Contributor

@FUDCo FUDCo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ship it!

@rekmarks rekmarks merged commit a58eb21 into main Apr 22, 2025
21 checks passed
@rekmarks rekmarks deleted the rekm/rpc-vat-kernel branch April 22, 2025 17:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Simplify command and stream types

5 participants